AITopics | translation memory

Collaborating Authors

translation memory

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data

Bouthors, Maxime, Crego, Josep, Yvon, François

arXiv.org Artificial IntelligenceOct-2-2025

Conventional retrieval-augmented neural machine translation (RANMT) systems leverage bilingual corpora, e.g., translation memories (TMs). Yet, in many settings, monolingual corpora in the target language are often available. This work explores ways to take advantage of such resources by directly retrieving relevant target language segments, based on a source-side query. For this, we design improved cross-lingual retrieval systems, trained with both sentence level and word-level matching objectives. In our experiments with three RANMT architectures, we assess such cross-lingual objectives in a controlled setting, reaching performances that match those of standard TM-based models. We also showcase our method on a real-world settings, using much larger monolingual and observe strong improvements over both the baseline setting and general-purpose cross-lingual retrievers.

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.21747

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.28)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

TULUN: Transparent and Adaptable Low-resource Machine Translation

Merx, Raphaël, Suominen, Hanna, Hong, Lois, Thieberger, Nick, Cohn, Trevor, Vylomova, Ekaterina

arXiv.org Artificial IntelligenceMay-27-2025

Machine translation (MT) systems that support low-resource languages often struggle on specialized domains. While researchers have proposed various techniques for domain adaptation, these approaches typically require model fine-tuning, making them impractical for non-technical users and small organizations. To address this gap, we propose Tulun, a versatile solution for terminology-aware translation, combining neural MT with large language model (LLM)-based post-editing guided by existing glossaries and translation memories. Our open-source web-based platform enables users to easily create, edit, and leverage terminology resources, fostering a collaborative human-machine translation process that respects and incorporates domain expertise while increasing MT accuracy. Evaluations show effectiveness in both real-world and benchmark scenarios: on medical and disaster relief translation tasks for Tetun and Bislama, our system achieves improvements of 16.90-22.41 ChrF++ points over baseline MT systems. Across six low-resource languages on the FLORES dataset, Tulun outperforms both standalone MT and LLM approaches, achieving an average improvement of 2.8 ChrF points over NLLB-54B.

large language model, natural language, translation, (17 more...)

arXiv.org Artificial Intelligence

2505.18683

Country:

Europe (1.00)
North America > United States (0.68)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Predicting Anchored Text from Translation Memories for Machine Translation Using Deep Learning Methods

Yue, Richard, Ortega, John E.

arXiv.org Artificial IntelligenceSep-26-2024

Translation memories (TMs) are the backbone for professional translation tools called computer-aided translation (CAT) tools. In order to perform a translation using a CAT tool, a translator uses the TM to gather translations similar to the desired segment to translate (s'). Many CAT tools offer a fuzzy-match algorithm to locate segments (s) in the TM that are close in distance to s'. After locating two similar segments, the CAT tool will present parallel segments (s, t) that contain one segment in the source language along with its translation in the target language. Additionally, CAT tools contain fuzzy-match repair (FMR) techniques that will automatically use the parallel segments from the TM to create new TM entries containing a modified version of the original with the idea in mind that it will be the translation of s'. Most FMR techniques use machine translation as a way of "repairing" those words that have to be modified. In this article, we show that for a large part of those words which are anchored, we can use other techniques that are based on machine learning approaches such as Word2Vec. BERT, and even ChatGPT. Specifically, we show that for anchored words that follow the continuous bag-of-words (CBOW) paradigm, Word2Vec, BERT, and GPT-4 can be used to achieve similar and, for some cases, better results than neural machine translation for translating anchored words from French to English.

machine translation, translation, word2vec, (15 more...)

arXiv.org Artificial Intelligence

2409.17939

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
(12 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Optimizing example selection for retrieval-augmented machine translation with translation memories

Bouthors, Maxime, Crego, Josep, Yvon, François

arXiv.org Artificial IntelligenceMay-23-2024

Retrieval-augmented machine translation leverages examples from a translation memory by retrieving similar instances. These examples are used to condition the predictions of a neural decoder. We aim to improve the upstream retrieval step and consider a fixed downstream edit-based model: the multi-Levenshtein Transformer. The task consists of finding a set of examples that maximizes the overall coverage of the source sentence. To this end, we rely on the theory of submodular functions and explore new algorithms to optimize this coverage. We evaluate the resulting performance gains for the machine translation task.

computational linguistic, machine translation, translation, (12 more...)

arXiv.org Artificial Intelligence

2405.1507

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)
North America > United States > Maryland > Baltimore (0.04)
(15 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison

Bouthors, Maxime, Crego, Josep, Yvon, Francois

arXiv.org Artificial IntelligenceApr-3-2024

Retrieval-Augmented Neural Machine Translation (RAMT) architectures retrieve examples from memory to guide the generation process. While most works in this trend explore new ways to exploit the retrieved examples, the upstream retrieval step is mostly unexplored. In this paper, we study the effect of varying retrieval methods for several translation architectures, to better understand the interplay between these two processes. We conduct experiments in two language pairs in a multi-domain setting and consider several downstream architectures based on a standard autoregressive model, an edit-based model, and a large language model with in-context learning. Our experiments show that the choice of the retrieval technique impacts the translation scores, with variance across architectures. We also discuss the effects of increasing the number and diversity of examples, which are mostly positive across the board.

computational linguistic, machine translation, translation, (13 more...)

arXiv.org Artificial Intelligence

2404.02835

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)
North America > United States > Maryland > Baltimore (0.04)
(17 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Improving Neural Machine Translation by Multi-Knowledge Integration with Prompting

Wang, Ke, Xie, Jun, Zhang, Yuqi, Zhao, Yu

arXiv.org Artificial IntelligenceDec-7-2023

Improving neural machine translation (NMT) systems with prompting has achieved significant progress in recent years. In this work, we focus on how to integrate multi-knowledge, multiple types of knowledge, into NMT models to enhance the performance with prompting. We propose a unified framework, which can integrate effectively multiple types of knowledge including sentences, terminologies/phrases and translation templates into NMT models. We utilize multiple types of knowledge as prefix-prompts of input for the encoder and decoder of NMT models to guide the translation process. The approach requires no changes to the model architecture and effectively adapts to domain-specific translation without retraining. The experiments on English-Chinese and English-German translation demonstrate that our approach significantly outperform strong baselines, achieving high translation quality and terminology match accuracy.

knowledge, machine translation, translation, (15 more...)

arXiv.org Artificial Intelligence

2312.04807

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(11 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Synslator: An Interactive Machine Translation Tool with Online Learning

Wang, Jiayi, Wang, Ke, Zhou, Fengming, Wang, Chengyu, Fu, Zhiyong, Feng, Zeyu, Zhao, Yu, Zhang, Yuqi

arXiv.org Artificial IntelligenceOct-8-2023

Interactive machine translation (IMT) has emerged as a progression of the computer-aided translation paradigm, where the machine translation system and the human translator collaborate to produce high-quality translations. This paper introduces Synslator, a user-friendly computer-aided translation (CAT) tool that not only supports IMT, but is adept at online learning with real-time translation memories. To accommodate various deployment environments for CAT services, Synslator integrates two different neural translation models to handle translation memories for online learning. Additionally, the system employs a language model to enhance the fluency of translations in an interactive mode. In evaluation, we have confirmed the effectiveness of online learning through the translation models, and have observed a 13% increase in post-editing efficiency with the interactive functionalities of Synslator. A tutorial video is available at:https://youtu.be/K0vRsb2lTt8.

machine translation, synslator, translation, (16 more...)

arXiv.org Artificial Intelligence

2310.05025

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
Asia > Indonesia > Bali (0.04)

Genre:

Instructional Material (0.68)
Research Report (0.50)

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Rethinking Translation Memory Augmented Neural Machine Translation

Hao, Hongkun, Huang, Guoping, Liu, Lemao, Zhang, Zhirui, Shi, Shuming, Wang, Rui

arXiv.org Artificial IntelligenceJun-12-2023

This paper rethinks translation memory augmented neural machine translation (TM-augmented NMT) from two perspectives, i.e., a probabilistic view of retrieval and the variance-bias decomposition principle. The finding demonstrates that TM-augmented NMT is good at the ability of fitting data (i.e., lower bias) but is more sensitive to the fluctuations in the training data (i.e., higher variance), which provides an explanation to a recently reported contradictory phenomenon on the same translation task: TM-augmented NMT substantially advances vanilla NMT under the high-resource scenario whereas it fails under the low-resource scenario. Then we propose a simple yet effective TM-augmented NMT model to promote the variance and address the contradictory phenomenon. Extensive experiments show that the proposed TM-augmented NMT achieves consistent gains over both conventional NMT and existing TM-augmented NMT under two variance-preferable (low-resource and plug-and-play) scenarios as well as the high-resource scenario.

artificial intelligence, computational linguistic, natural language, (16 more...)

arXiv.org Artificial Intelligence

2306.06948

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(19 more...)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Neural Machine Translation with Contrastive Translation Memories

Cheng, Xin, Gao, Shen, Liu, Lemao, Zhao, Dongyan, Yan, Rui

arXiv.org Artificial IntelligenceDec-6-2022

Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios. Different from previous works that make use of mutually similar but redundant translation memories~(TMs), we propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence while individually contrastive to each other providing maximal information gains in three phases. First, in TM retrieval phase, we adopt a contrastive retrieval algorithm to avoid redundancy and uninformativeness of similar translation pieces. Second, in memory encoding stage, given a set of TMs we propose a novel Hierarchical Group Attention module to gather both local context of each TM and global context of the whole TM set. Finally, in training phase, a Multi-TM contrastive learning objective is introduced to learn salient feature of each TM with respect to target sentence. Experimental results show that our framework obtains improvements over strong baselines on the benchmark datasets.

artificial intelligence, natural language, translation memory, (15 more...)

arXiv.org Artificial Intelligence

2212.0314

Country:

Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
Asia > South Korea (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Towards Robust k-Nearest-Neighbor Machine Translation

Jiang, Hui, Lu, Ziyao, Meng, Fandong, Zhou, Chulun, Zhou, Jie, Huang, Degen, Su, Jinsong

arXiv.org Artificial IntelligenceOct-17-2022

k-Nearest-Neighbor Machine Translation (kNN-MT) becomes an important research direction of NMT in recent years. Its main idea is to retrieve useful key-value pairs from an additional datastore to modify translations without updating the NMT model. However, the underlying retrieved noisy pairs will dramatically deteriorate the model performance. In this paper, we conduct a preliminary study and find that this problem results from not fully exploiting the prediction of the NMT model. To alleviate the impact of noise, we propose a confidence-enhanced kNN-MT model with robust training. Concretely, we introduce the NMT confidence to refine the modeling of two important components of kNN-MT: kNN distribution and the interpolation weight. Meanwhile we inject two types of perturbations into the retrieved pairs for robust training. Experimental results on four benchmark datasets demonstrate that our model not only achieves significant improvements over current kNN-MT models, but also exhibits better robustness. Our code is available at https://github.com/DeepLearnXMU/Robust-knn-mt.

datastore, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2210.08808

Country:

Asia > China > Fujian Province > Xiamen (0.04)
Asia > China > Liaoning Province > Dalian (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback